The interaction of disease transmission, mortality, and economic output over the first 2 years of the COVID-19 pandemic

Background The COVID-19 pandemic has caused over 7.02 million deaths as of January 2024 and profoundly affected most countries’ Gross Domestic Product (GDP). Here, we study the interaction of SARS-CoV-2 transmission, mortality, and economic output between January 2020 and December 2022 across 25 European countries. Methods We use a Bayesian mixed effects model with auto-regressive terms to estimate the temporal relationships between disease transmission, excess deaths, changes in economic output, transit mobility and non-pharmaceutical interventions (NPIs) across countries. Results Disease transmission intensity (logRt) decreases GDP and increases excess deaths, where the latter association is longer-lasting. Changes in GDP as well as prior week transmission intensity are both negatively associated with each other (-0.241, 95% CrI: -0.295 - -0.189). We find evidence of risk-averse behaviour, as changes in transit and prior week transmission intensity are negatively associated (-0.055, 95% CrI: -0.074 to -0.036). Our results highlight a complex cost-benefit trade-off from individual NPIs. For example, banning international travel is associated with both increases in GDP (0.014, 0.002—0.025) and decreases in excess deaths (-0.014, 95% CrI: -0.028 - -0.001). Country-specific random effects, such as the poverty rate, are positively associated with excess deaths while the UN government effectiveness index is negatively associated with excess deaths. Interpretation The interplay between transmission intensity, excess deaths, population mobility and economic output is highly complex, and none of these factors can be considered in isolation. Our results reinforce the intuitive idea that significant economic activity arises from diverse person-to-person interactions. Our analysis quantifies and highlights that the impact of disease on a given country is complex and multifaceted. Long-term economic impairments are not fully captured by our model, as well as long-term disease effects (Long COVID).


A Pandemic information / data plots
We consider data for the SARS-CoV-2 pandemic for 25 European countries from 1 Jan 2020 to 31 Dec 2021.The start of the data availability varies by country as a function of the start of the pandemic and data collection in that country.
A.1 Response Variables, by country, over the full period

A.3 Vaccination & Variant data
Our World In Data [4] provides a range of vaccination statistics.We use the total number of vaccine doses administered over time.dividing by the total population [5] to calculate the number of doses delivered per capita.SARS-CoV-2 variant data is sourced from [6], which provides a summary view by country and week based on GISAID data.We utilise the Nextstrain Clade, Pango Lineage and WHO Label mapping to map all Nextstrain Clades into WHO Labels, for use in the model.For each week we pick the dominant strain in each country as the strain with the majority of sequenced samples (Figure A.9).We observe a data gap for Hungary, which is missing weeks 9-45 in 2021.We estimate data for Hungary during this time period by using the average of the data from surrounding countries, weighted by border length.A.6 Country Characteristic Data We consider a wide range of data in the analysis for which we list a high level overview in Table A. 4.This is a mix of time varying variables (such as cases or excess deaths) and variables which we consider time invariant, as they changes over a time frame longer than the time frame of our analysis.We will use these time invariant data as country characteristics in our analysis.In particular we list the country characteristics which we consider in our analysis.We provide the name of the data item (which can be retrieved from the data libraries listed in Table A.4), the latest year for which we have data available and the organisation providing the data:

B.1 Reproduction Numbers using EpiNow2
We use instantaneous reproduction numbers computed by the London School of Hygine and Tropical Medicine (LSHTM) using the EpiNow2 model [1] (the model and code is freely available on github and epiforecasts).

B.2 Model specifications
We consider 3 model specifications.Model 2 is described in the main text and results are provided.
The estimation results for model 1 and 3 are provided in this supplement.
Model 1 First we consider the interaction of transmission intensity, excess deaths, changes in NPIs, changes in GDP and changes in transit behaviour.Changes in NPIs are given by changes to the overall stringency index for each country over the respective period of time.The only covariates are vaccination, given by the average number of vaccinations per person for a given country at time t and the dominant variant of SARS-CoV-2.
In this model the predictor coefficients are common across countries, and we have country specific intercepts (random effects) µ c for each country c.
where : Here we consider individual NPIs to be covariates to the response variables.The implementation of NPIs, both in timing and level of severity, have varied between countries.Early in the pandemic (March to September 2020) implementations of mandates were reasonably homogeneous, but we subsequently observe significant heterogeneity, both in time and amongst countries as we observed divergent objectives by policy makers and debates in society on mandates.In this model we consider lagged NPIs.
where : In addition to the coefficients used in Model 1, we have additional coefficients λ and δ, that respectively denote the (fixed-effects) level and changes in NPIs.
Model 3 Finally we consider a model with only one response variable and consider the impact of NPIs, vaccination and dominant variant only.This type of model can be considered a simple extension to [28,29].
where : x t = NPI t−1 vacc t = Average Vaccinations per person t

B.3 Identification strategy
Example for N = 2: (5) where we need to solve for four unknowns but only have 3 equations.

C Estimation results
We provide an overview of results for all 3 models in this section.All results can be reproduced with the data and code available on GitHub (https://github.com/cm401/covid_eco_epi_var).

C.1 Model 1
For Model 1 we focus on the results of the Orthogonal Impulse Response Function which we provide in Figure C.13.A positive impulse in transmission intensity leads to an increase in excess deaths, peaking after 5 weeks, and the decay of the impulse is slow and prolonged.A shorter but positive response can also be seen in changes to the overall stringency of NPIs.We also observe a significant and negative response in economic activity but no significant impact on changes in transit behaviour.
The effect to a positive impulse in excess deaths is smaller than that for transmission intensity.A positive impulse to changes in the overall stringency of NPIs has only a positive effect on itself but no other variable.Considering economic activity, we observe that a positive impulse leads to negative changes in overall stringency of NPIs, a positive effect on economic activity itself and a significant positive effect on transit mobility in the next time period.

C.2 Model 2
We provide the coefficient estimates for the model in 5 tables: 1. Estimates of the VAR coefficients Φ i,j,k=1 in Table C

C.3 Model 3
For Model 3 we do not have vector auto-regressive coefficients (although we do have auto-regressive coefficients for each one of our response variables).

C.5 Forecast comparison
We are interested in the causal relationships of our response variables.Clearly, a cause cannot precede its the effect.If a variable x affects another variable y, the former should help improve the predictions of the latter.We again use LOO-CV to estimate the pointwise out-of-sample prediction accuracy from our fitted model.We systematically remove predictive variables from the model: if variable x is removed, it is removed from all equations, except the equation of x on itself (e.g.removing logR removes it from the right hand side of the equations of logED, ∆ GDP and ∆ Transit, but not the right hand side of logR itself).We do this for each response variable individually, all combinations of two response variables and removing all response variables (in which case each variable uses only itself).
The results are summarised in Table C.11, ordered by decreasing expected log pointwise predictive density (ELPD) difference.
The full model containing all variables has the best ELPD, although models excluding ∆ Transit, ∆ GDP or both of these are not statistically different at the 95% level from the full model.Excluding logR or logED, and any combination containing these predictors, the difference is statistically significant.logR is the most important predictive variable, as it leads to the largest negative ELPD differences, either on its own (compared to other single variable exclusions) or in combination with other variables excluded.Excluding all variables, so that each response variable is a function of only its past observations, leads to the largest ELPD negative difference indicating the worst performing model with respect to sample prediction accuracy.Together with the above vector-autoregressive coefficient estimates and the impulse response function, we have further evidence that transmission intensity increases excess deaths and negatively changes GDP.
To further assess the predictive performance of our model, we compare the 1-timestep ahead forecast from our model to the naive forecast, which assumes the next value of a response variable is the same as its current value, that is x(t+1) = x(t).We find that our model improves predictions of all response variables, with a reduction of root mean squared error of 8% for transmission intensity, 6% for excess deaths, 27% for ∆ GDP and 32% for ∆ Transit (Figure C.19).The improvements are due to improved forecasts in the tail of the distributions as opposed to the centre.This is expected, as the centre represents relatively stationary time periods.The model however does not have any awareness of e.g.holiday effects or other exogenous dynamics, and hence does not outperform the naive forecast during these periods.In the tails, i.e. during periods of high transmission intensity, the model performs better than the naive forecast.We also consider robustness for parameter choices in our model and estimation, such as τ .The variable selection broadly follows the approach taken in [28].
There are a range of different specification for the model.Rather than estimating the model with no intercept we could introduce a global intercept and impose a constraint on the wild type variant coefficient to equal zero.The estimated coefficients are essentially unchanged by this.).This implies that the posterior has converged and can be used for inference.
We used 2000 warmup samples and 2000 iterations per chain, for 4 chains.We also see that the relative effective sample size exceeds 0.5 for the majority of parameters, indicating low autocorrelation and fast mixing.

Figure A. 1 :Figure A. 2 :Figure A. 3 :
Figure A.1: log R (transmission intensity) over the course of 2020-21.We can observe heterogeneity across both countries and time.

Figure A. 4 :
Figure A.4: Indexed GDP over the course of 2020-21.We can observe heterogeneity across both countries and time.

Figure A. 6 : 3 -
Figure A.6: Non-pharmaceutical Interventions: Time-series of NPIs over the cause of the pandemic by country.The blue line is the smoothed pan-European average for each NPI.

Figure A. 8 :
Figure A.8: Vaccinations: Average vaccination per person (total number of vaccines delivered divided by total population).

Figure A. 9 :Figure A. 11 :
Figure A.9: SARS-CoV-2 Dominant Variant: Largest Variant as measured by sequenced samples over time in each country.

Figure C. 13 :
Figure C.13: Model 1 Orthogonal IRF: Columns are the variables to which the shock is applied to, rows are the variables which we observe the impulse response for.Mean estimates (solid black) and 95% CrI (dotted red).

Figure C. 20 :
Figure C.20: Country Sensitivity Analysis: Marginal posterior density 90% confidence interval and mean for coefficients of the VAR component.Black line is the estimate for the full model, each colour represents the estimate of the model excluding that country.

Figure C. 21 :
Figure C.21: MCMC convergence statistics (taken from a run using model values for all parameters as outlined in Table 1).Left: R values are close to 1, indicating convergence.Right: relative Effective sample size .

Table A
[3] OxCGRT Codebook Containment and closure policies[3]: Missing data will be represented as blank in the database, Coding 0 will be applied if no measure was in place at that moment in time.All Measures are at Ordinal Scale.aNote: this records policy for foreign travellers, not citizens Table A.2: OxCGRT Codebook economic policies[3]: Missing data will be represented as blank in the database, Coding 0 will be applied if no measure or support was in place at that moment in time.aNote: only includes payments to firms if explicitly linked to payroll/salaries

Table A .
4: Country Characteristics: Overview of data items, reference year and coverage of countries.Abbreviations for data sources are EuroStat (ES), Heritage Foundation (HF), World Bank (WB), International Monetary Fund (IMF), United Nations (UN), Freedom House (FH) Ψ j,t,c is the coefficient for the j th dominant variant of SARS-CoV-2, at time t for country c.Ψ is constructed such that Ψ W T,•,• is always 1 (and acts as an intercept term), Ψ Alpha,•,• is 1 unless Wildtype is the dominant SARS-CoV-2 variant (in which case it is 0), Ψ Delta,•,• is 1 unless Wildtype or AlpgaAlpha are the dominant variants and Ψ Omicron,•,• is 1 only if Omicron is the dominant variant.
y 2t,c , y 3t,c , y 4t,c ⊤ = log R t , log Excess Deaths t , ∆NPI t , ∆GDP t , ∆Transit t , ⊤(2)vacc t = Average Vaccinations per person t Y t,c is a multivariate normal random variable with covariance matrix Σ u .y t,c is a vector containing the values of the response variables, as defined in Equation2, at time t for country c.Φ k is the N × N coefficient matrix of the vector auto-regressive component for lag k. ν is the coefficient for vaccination (defined as the average number of vaccinations per person at time t for country c).
log Excess Deaths t , ∆GDP t , ∆Transit t , ⊤x t = NPI t−1 vacc t = Average Vaccinations per person t

Table C
.5 2. Estimates of the coefficients for NPI levels λ in Table C.6 3. Estimates of the coefficients for NPI changes δ in Table C.7 4. Estimates for the Dominant Varient coefficients ψ in Table C.8 5. Estimates for Vaccination coefficients ν in Table C.9In the results section of the main text we considered the model choice for the specification of Model 2 with .5: Estimates for VAR coefficients Φ i,j,k=1 of the model • NPI changes • NPI levels We can run the model with exactly the same setup but use only NPI changes (LHS of Figure C.14) or use only NPI levels (RHS of Figure C.14).Tables with the numeric values of these estimates can be generated from the model output using the GitHub code.

Table C .
15e results for the effect size of NPIs on each of the response variables is displayed in FigureC.15.We observe that estimates are different from Model 2 but substantively the same (sign of estimates are mostly the same and the factors which are significant are broadly similar).10: Covariates: Correlation of country specific characteristics to country specific response variable intercepts across Government, Societal, economic and Healthcare characteristics.
Figure C.18: Country Effects: Country Level Effects of Response Variables for all Response variable combinations.We highlight the 4 quadrants to display where countries are in the representation.

Table C .
11: Leave-One-Out cross-validation (LOO-CV) to estimate the pointwise out of sample prediction accuracy of fitted Bayesian models.The full model is that described in the main text, excluded variables are removed from the prediction variables of the model (except for variables on themselves).Expected log pointwise predictive density (ELPD) differences and Standard Errors are reported.